Abstract

We consider a fixed quantum measurement performed over n identical
copies of quantum states. Using a rigorous notion of distinguishability
based on Shannon's 12th theorem, we show that in the case of a single
qubit the number of distinguishable states is
$W(\alpha_1,\alpha_2,n) = |\alpha_1-\alpha_2|\sqrt{2n/(\pi e)}$, where
$(\alpha_1,\alpha_2)$ is the angle interval from which the states are
chosen. In the general case of an N-dimensional Hilbert space and an
area $\Omega$ of the domain on the unit sphere from which the states are
chosen, the number of distinguishable states is
$W(N,n,\Omega) = \Omega\,(2n/(\pi e))^{(N-1)/2}$. The optimal
distribution is uniform over the domain in Cartesian coordinates.

1 Introduction 

In his remarkable 1981 paper, "Statistical Distance and Hilbert Space"
[1], Wootters showed that the statistical distance between two vectors
in Hilbert space is proportional to the angle between these two vectors
and does not depend on the position of the vectors. He defines
statistical distance as the number of distinguishable intermediate
states between the two vectors. However, his notion of
distinguishability relies on the apparently arbitrary criterion that two
states are distinguishable if measurements performed on n identical
copies of each state yield two distributions whose means are separated
by a constant factor times the sum of the standard deviations of these
distributions. We use a more rigorous notion of distinguishability based
on Shannon's 12th theorem [2] and arrive at an expression for the number
of distinguishable states that is consistent with Wootters' result;
however, unlike that result, our expression does not depend on an
arbitrary choice of the distinguishability criterion. Rather, our notion
of distinguishability is predicated on the guarantee that the measurer
be able to distinguish between the quantum states with probability
approaching 1 as the number n of copies of identical states in a sample
tends to infinity. Wootters shows that for large n the number of
distinguishable states between the vectors at angles $\alpha_1$ and
$\alpha_2$ is proportional to $|\alpha_1-\alpha_2|\sqrt{n}$, where
$\alpha$ is the angle of the vector from some reference direction in the
plane spanned by the two vectors. We show in Section 2 that the actual
number of distinguishable states in a 2-dimensional Hilbert space is

$$W(\alpha_1,\alpha_2,n) = e^{I_{\sup}(P;K)} = |\alpha_1-\alpha_2|\sqrt{\frac{2n}{\pi e}} \tag{1}$$

where $I_{\sup}(P;K)$ is the maximum mutual information between the
(random) quantum state and the results of measurements. We prove that
this maximum is achieved for an ensemble of quantum states with the
uniform distribution of the angle $\alpha$ for any interval
$[\alpha_2,\alpha_1]$. The independence of the number of distinguishable
states of the position of the interval $[\alpha_2,\alpha_1]$ is a
remarkable asymptotic property that does not hold for small values of n
(cf. [?]).

Section 3 of this paper provides a generalization of these results to
the case of an N-dimensional Hilbert space of states of the quantum
system. It turns out that the number of distinguishable states depends
only on the area $\Omega$ of the domain on the unit sphere from which
the states can be chosen, but does not depend on the shape and position
of this domain. The optimal distribution is uniform over this domain in
Cartesian coordinates, and the number of distinguishable states is
$W(N,n,\Omega) = \Omega\,(2n/(\pi e))^{(N-1)/2}$.

2 The Case of a Single Qubit 
2.1 Formulation of the Problem 

Consider a quantum physical system whose states are unit vectors in a
2-dimensional complex Hilbert space $\mathbb{C}^2$ (the so-called
"qubit"). Denote the state vector by v and let $(\varphi,\psi)$ be an
orthonormal basis in the Hilbert space, so that $v = a\varphi + b\psi$,
where $a = (v|\varphi)$, $b = (v|\psi)$ are inner products and
$|a|^2 + |b|^2 = 1$. Then $|a|^2 = p$ and $|b|^2 = 1-p$ are the
probabilities of the two possible outcomes of the measurement performed
over the state v in the $(\varphi,\psi)$ basis. Obviously, these
probabilities do not depend on the phases of the coefficients a and b,
and, therefore, all quantum states with the same magnitudes $|a| = x$
and $|b| = y$ are indistinguishable by this measurement. Hence, the
state space $S^3$ can be reduced to the non-negative quadrant of a
circle in a real 2-dimensional Euclidean space (Fig. 1), spanned by
$\varphi$ and $\psi$. Now let $v_1$ and $v_2$ be two distinct state
vectors, such that

$$v_i = x_i\varphi + y_i\psi, \quad\text{where } x_i = \sqrt{p_i} \text{ and } y_i = \sqrt{1-p_i}, \text{ for } i = 1,2. \tag{2}$$

Denote by $\alpha_i$ the angle between $\varphi$ and $v_i$, so that

$$p_i = \cos^2\alpha_i, \quad 1-p_i = \sin^2\alpha_i, \quad i = 1,2. \tag{3}$$

Figure 1: The two state vectors $v_1$, $v_2$ and their projections
$x_i = \sqrt{p_i}$, $y_i = \sqrt{1-p_i}$ on the basis elements $\varphi$
and $\psi$.

Suppose we want to distinguish between various quantum states chosen
from the interval of angles $[\alpha_2,\alpha_1]$ by performing
measurements in the $(\varphi,\psi)$ basis. Further, assume that we are
allowed to perform the measurement over n identical copies of each
quantum state.

Problem: What determines the number of distinguishable states, and
what is the asymptotic expression for the number of states in the
interval $[\alpha_2,\alpha_1]$ that can be distinguished with
probability approaching 1 when n tends to infinity?

As shown in the next section, the problem can be rigorously analyzed 
by applying concepts and results of Shannon's information theory. 

2.2 Information-Theoretical Description 

Suppose the state vectors are chosen from the angle interval
$[\alpha_2,\alpha_1]$ with a certain probability density function
(p.d.f.) $P_A(\alpha)$, where A is a random variable that takes on
values $\alpha \in [\alpha_2,\alpha_1]$. Let $P_P(p)$ be the p.d.f. of
the random variable P that takes on values p, where p is the probability
that the measurement projects the state vector onto the basis vector
$\varphi$. Obviously, $P = \cos^2 A$, and the value of P (or of A)
characterizes uniquely the chosen quantum state. In a series of n
measurements, let K be the (random) number of measurements which have
resulted in projections onto $\varphi$. The conditional probability
distribution of K given P is binomial:

$$P_{K/P}(k/p) = \binom{n}{k} p^k (1-p)^{n-k}, \quad\text{where } k = 0, 1, \ldots, n. \tag{4}$$
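
As a minimal sketch of the measurement model (not part of the original
derivation), the binomial law (4) can be evaluated and sampled directly.
The function names `binomial_pmf` and `simulate_counts` are illustrative
assumptions, and the pseudo-random generator merely stands in for the
physical measurement.

```python
import math
import random

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Eq. (4): probability of k projections onto phi in n measurements."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def simulate_counts(alpha: float, n: int, rng: random.Random) -> int:
    """Simulate measuring n copies of the state at angle alpha; return K."""
    p = math.cos(alpha) ** 2  # Eq. (3): p = cos^2(alpha)
    return sum(rng.random() < p for _ in range(n))

# Example: n = 100 copies of the state at alpha = pi/3 (so p = 1/4).
rng = random.Random(0)
k_observed = simulate_counts(math.pi / 3, 100, rng)
total = sum(binomial_pmf(k, 100, 0.25) for k in range(101))
```

Here `k_observed` is one realisation of K, and `total` checks that the
probabilities in (4) sum to one up to rounding.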

The values of K obtained in the measurement are the only data available
from which one can infer the value of P, i.e., the choice of a quantum
state.

Let $P_K(k)$ be the marginal probability distribution of K. The
information $I(K;P)$ in K about P is given by

$$I(K;P) = \int_{p_1}^{p_2} \sum_{k=0}^{n} P_P(p)\,P_{K/P}(k/p)\,\ln\frac{P_{K/P}(k/p)}{P_K(k)}\,dp. \tag{5}$$
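
For illustration, (5) can be evaluated numerically once the prior on P
is replaced by a finite set of points with weights. The sketch below
makes that discretisation assumption; the name `mutual_information` and
its arguments are hypothetical and the result is in nats.

```python
import math

def binomial_pmf(k, n, p):
    """Eq. (4)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def mutual_information(ps, weights, n):
    """Discretised Eq. (5): prior mass weights[j] sits on the value ps[j]."""
    cond = [[binomial_pmf(k, n, p) for k in range(n + 1)] for p in ps]
    # Marginal P_K(k) = sum_j weights[j] * P_{K/P}(k/ps[j]).
    marg = [sum(w * row[k] for w, row in zip(weights, cond))
            for k in range(n + 1)]
    info = 0.0
    for w, row in zip(weights, cond):
        for k in range(n + 1):
            if w > 0.0 and row[k] > 0.0:  # a term 0 * ln 0 is taken as 0
                info += w * row[k] * math.log(row[k] / marg[k])
    return info
```

With the two perfectly distinguishable inputs p = 0 and p = 1, equal
weights and a single measurement, the function returns ln 2, i.e. one
bit, as expected.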

The importance of considering the information $I(K;P)$ stems from
Shannon's 12th theorem [2] which, for our setting of the problem, can be
rephrased in the following way.

Let $S = \{\mathbf{p}\}$, where $\mathbf{p}$ is an n-dimensional vector
$\mathbf{p} = (p, p, \ldots, p)$ and $p \in [p_1,p_2]$, be the set of
all possible input signals and $Z_n = \{0, 1, \ldots, n\}$ be the set of
all output signals in a communication channel with a conditional
probability distribution given by (4). Let L be the length of a sequence
of such input signals used independently. Then for any $\varepsilon > 0$
the maximum number $M(L,\varepsilon)$ of input signals that can be
chosen from S in such a way that the probability of error (incorrect
decision about p based on the value of the output signal $k \in Z_n$)
does not exceed $\varepsilon$ satisfies the asymptotic property:

$$\lim_{L\to\infty} \frac{\ln M(L,\varepsilon)}{L} = I_{\sup}(K;P), \tag{6}$$

where $I_{\sup}(K;P)$ is the least upper bound of $I(K;P)$ given by (5)
over all possible probability distributions $P_P(p)$ of the input
parameter P.

Note that the asymptotic expression for $M(L,\varepsilon)$ in fact does
not depend on $\varepsilon$. This means that the number of distinct
input signals (different values of P) that can be distinguished with
probability arbitrarily close to 1 is $e^{I_{\sup}(K;P)}$. The problem
is reduced now to the computation of $I_{\sup}(K;P)$ under the condition
that P takes on values in $[p_1,p_2]$. This problem is very difficult,
in general. However, the following important theorem will be helpful.

Define the individual information in $P = p$ about K as

$$I(K;p) = \sum_{k=0}^{n} P_{K/P}(k/p)\,\ln\frac{P_{K/P}(k/p)}{P_K(k)}. \tag{7}$$

As is well known (e.g. [?]), $I(K;P)$ achieves the maximum value
$I_{\sup}(K;P)$ for such a distribution $P_P(p)$ that there exists a
constant I such that

$$I(K;p) = I \quad\text{for all } p \text{ such that } P_P(p) > 0 \tag{8}$$

and

$$I(K;p) \le I \quad\text{for all } p \text{ such that } P_P(p) = 0. \tag{9}$$

Then $I_{\sup}(K;P) = I$.
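
Conditions (8) and (9) can be explored numerically for small n. The
sketch below uses the Blahut-Arimoto iteration, a standard numerical
technique that is not used in this paper, on a finite grid of values of
p. Both the grid and the function names are assumptions made for
illustration only.

```python
import math

def binomial_pmf(k, n, p):
    """Eq. (4)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def individual_information(row, marginal):
    """Eq. (7) for one input value; zero-probability terms contribute 0."""
    return sum(w * math.log(w / m) for w, m in zip(row, marginal) if w > 0.0)

def marginal_of(weights, channel):
    """P_K(k) for a prior with the given weights on the grid points."""
    return [sum(w * row[k] for w, row in zip(weights, channel))
            for k in range(len(channel[0]))]

def blahut_arimoto(ps, n, iterations=200):
    """Return (weights, individual informations, mutual information in nats)."""
    channel = [[binomial_pmf(k, n, p) for k in range(n + 1)] for p in ps]
    weights = [1.0 / len(ps)] * len(ps)  # start from a uniform prior on the grid
    for _ in range(iterations):
        marginal = marginal_of(weights, channel)
        info = [individual_information(row, marginal) for row in channel]
        # Reweight each grid point by exp(I(K;p)) and renormalise.
        scaled = [w * math.exp(i) for w, i in zip(weights, info)]
        total = sum(scaled)
        weights = [s / total for s in scaled]
    marginal = marginal_of(weights, channel)
    info = [individual_information(row, marginal) for row in channel]
    mutual = sum(w * i for w, i in zip(weights, info))
    return weights, info, mutual
```

The returned mutual information never exceeds the largest individual
information, and the two approach each other as the iteration
converges, which is the numerical counterpart of (8) and (9).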

2.3 The Number of Distinguishable States

When n is large, the binomial distribution (4) can be well approximated
by a Gaussian distribution:

$$P_{K/P}(k/p) = \binom{n}{k} p^k (1-p)^{n-k} \approx \frac{1}{\sqrt{2\pi n p(1-p)}}\, e^{-\frac{(k-np)^2}{2np(1-p)}}. \tag{10}$$

For large n, distribution (10) has a very sharp maximum at $k = pn$,
so that the Laplace method [?] can be used for evaluation of integrals
involving (10).
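
The quality of approximation (10) is easy to inspect numerically. The
following sketch compares the exact binomial probability with its
Gaussian approximation; the function names are illustrative, and the
log-gamma form is used only to avoid overflow for large n.

```python
import math

def binomial_pmf(k, n, p):
    """Exact left-hand side of Eq. (10), computed through log-gamma."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)

def gaussian_approx(k, n, p):
    """Right-hand side of Eq. (10): mean n*p, variance n*p*(1-p)."""
    var = n * p * (1.0 - p)
    return math.exp(-(k - n * p) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Ratio of exact to approximate values at and near the maximum k = p*n.
ratio_at_peak = binomial_pmf(300, 1000, 0.3) / gaussian_approx(300, 1000, 0.3)
ratio_off_peak = binomial_pmf(310, 1000, 0.3) / gaussian_approx(310, 1000, 0.3)
```

For n = 1000 and p = 0.3 both ratios are close to 1, with the larger
deviation away from the peak.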

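Consider a uniform distribution over the angle interval
$[\alpha_2,\alpha_1]$,

$$P_A(\alpha) = \frac{1}{|\alpha_1-\alpha_2|} \quad\text{for } \alpha \in [\alpha_2,\alpha_1]. \tag{11}$$

The corresponding distribution of the probability P is

$$P_P(p) = P_A(\alpha)\left|\frac{d\alpha}{dp}\right| = \frac{1}{2|\alpha_1-\alpha_2|\sqrt{p(1-p)}}, \quad\text{where } \alpha_i = \cos^{-1}\sqrt{p_i},\ i = 1, 2. \tag{12}$$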
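
As a quick consistency check on (12), the density should integrate to
one over $[p_1,p_2]$. The sketch below does this with a midpoint rule;
the function names and the step count are arbitrary choices made for
illustration.

```python
import math

def density_p(p, alpha1, alpha2):
    """Eq. (12): density of P = cos^2(A) for A uniform on [alpha2, alpha1]."""
    return 1.0 / (2.0 * abs(alpha1 - alpha2) * math.sqrt(p * (1.0 - p)))

def integrate_density(alpha1, alpha2, steps=20000):
    """Midpoint-rule integral of Eq. (12) between cos^2 of the two angles."""
    lo, hi = sorted((math.cos(alpha1) ** 2, math.cos(alpha2) ** 2))
    h = (hi - lo) / steps
    return h * sum(density_p(lo + (j + 0.5) * h, alpha1, alpha2)
                   for j in range(steps))

# Angles pi/3 and pi/6 correspond to p between 1/4 and 3/4.
normalisation = integrate_density(math.pi / 3, math.pi / 6)
```

The factor $1/(2\sqrt{p(1-p)})$ is just $|d\alpha/dp|$ for
$p = \cos^2\alpha$, so `normalisation` comes out equal to 1 up to
discretisation error.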

We will prove that for large n this distribution yields the maximum of
$I(K;P)$. The marginal probability distribution $P_K(k)$ can be
evaluated as follows (assuming $p_2 > p_1$):

$$P_K(k) = \int_{p_1}^{p_2} P_P(p)\,P_{K/P}(k/p)\,dp \approx \frac{1}{2|\alpha_1-\alpha_2|}\int_{p_1}^{p_2} \frac{1}{p(1-p)\sqrt{2\pi n}}\, e^{-\frac{(k-np)^2}{2np(1-p)}}\,dp. \tag{13}$$

If the point of maximum $p = k/n$ of the exponential function in the
integrand is within the interval $[p_1,p_2]$, the integration interval
can be extended to $(-\infty,\infty)$. Otherwise, the value of the
integral approaches zero when n tends to infinity. Thus, for large n we
obtain:

$$P_K(k) \approx \begin{cases} \dfrac{1}{2|\alpha_1-\alpha_2|\sqrt{k(n-k)}} & \text{if } np_1 < k < np_2, \\ 0 & \text{otherwise.} \end{cases} \tag{14}$$

Note that, as could be expected, the distribution of K for large n is
the discrete counterpart of the distribution of P. Now we can evaluate
the individual information $I(K;p)$.
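
The asymptotic form (14) can be compared with a direct numerical
evaluation of the marginal distribution. In the sketch below the
integral over p is rewritten as an average over the uniformly
distributed angle and computed with a midpoint rule. Function names and
grid size are assumptions made for illustration.

```python
import math

def log_binomial_pmf(k, n, p):
    """Logarithm of Eq. (4), via log-gamma to stay stable for large n."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def marginal_exact(k, n, alpha1, alpha2, steps=2000):
    """P_K(k) for A uniform on [alpha2, alpha1]: average of Eq. (4) over the angle."""
    lo, hi = sorted((alpha1, alpha2))
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        p = math.cos(lo + (j + 0.5) * h) ** 2
        total += math.exp(log_binomial_pmf(k, n, p))
    return total / steps

def marginal_asymptotic(k, n, alpha1, alpha2):
    """Eq. (14)."""
    p1, p2 = sorted((math.cos(alpha1) ** 2, math.cos(alpha2) ** 2))
    if n * p1 < k < n * p2:
        return 1.0 / (2.0 * abs(alpha1 - alpha2) * math.sqrt(k * (n - k)))
    return 0.0
```

For an interval such as $[\pi/6,\pi/3]$ and n of a few hundred, the two
functions agree closely for k well inside $(np_1, np_2)$, while the
exact marginal is negligible far outside it.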

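$$I(K;p) \approx \sum_{k=\lceil np_1\rceil}^{\lfloor np_2\rfloor} P_{K/P}(k/p)\,\ln\frac{P_{K/P}(k/p)}{P_K(k)} \approx \int_{np_1}^{np_2} \frac{e^{-\frac{(k-np)^2}{2np(1-p)}}}{\sqrt{2\pi n p(1-p)}}\,\bigl[\ln P_{K/P}(k/p) - \ln P_K(k)\bigr]\,dk. \tag{15}$$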

The first term in (15) is the differential entropy of a Gaussian
distribution (with the opposite sign), the second one can be evaluated
by the Laplace method. Hence, asymptotically,

$$I(K;p) = -\tfrac{1}{2}\ln[2\pi e\,p(1-p)\,n] + \tfrac{1}{2}\ln[p(1-p)\,n^2] + \ln 2|\alpha_1-\alpha_2| = \tfrac{1}{2}\ln\frac{2n}{\pi e} + \ln|\alpha_1-\alpha_2|. \tag{16}$$

Note that $I(K;p)$ takes on the same value for any $p \in [p_1,p_2]$.
Hence, distribution (11) (or (12)) is the optimal one for large n, and
the maximum information $I_{\sup}(K;P)$ is expressed asymptotically as
given below.

$$I_{\sup}(K;P) = \tfrac{1}{2}\ln\frac{2n}{\pi e} + \ln|\alpha_1-\alpha_2| \tag{17}$$
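
The claim that $I(K;p)$ is asymptotically independent of p can be
checked against exact binomial probabilities. The sketch below evaluates
(7) for a prior uniform in the angle and compares it with (16)/(17). The
names, the grid size and the cut-off for negligible terms are
illustrative assumptions.

```python
import math

def log_binomial_pmf(k, n, p):
    """Logarithm of Eq. (4)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def individual_information(p, n, alpha1, alpha2, steps=600):
    """Eq. (7) with exact binomials and A uniform on [alpha2, alpha1]."""
    lo, hi = sorted((alpha1, alpha2))
    h = (hi - lo) / steps
    grid = [math.cos(lo + (j + 0.5) * h) ** 2 for j in range(steps)]
    info = 0.0
    for k in range(n + 1):
        cond = math.exp(log_binomial_pmf(k, n, p))
        if cond < 1e-12:  # negligible contribution to the sum; skipped for speed
            continue
        marg = sum(math.exp(log_binomial_pmf(k, n, q)) for q in grid) / steps
        info += cond * math.log(cond / marg)
    return info

def asymptotic_information(n, alpha1, alpha2):
    """Eq. (17): (1/2) ln(2n/(pi e)) + ln|alpha1 - alpha2|."""
    return (0.5 * math.log(2.0 * n / (math.pi * math.e))
            + math.log(abs(alpha1 - alpha2)))
```

For values of p well inside the interval, the exact individual
information stays close to the asymptotic constant. Near the endpoints
the agreement is expected to degrade at finite n.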

Thus, the number of distinguishable quantum states in the interval of
angles $[\alpha_2,\alpha_1]$ is proportional to the length of the
interval and to $\sqrt{n}$. It does not depend on the position of the
interval in the circle.

$$W(n,\alpha_1,\alpha_2) = e^{I_{\sup}(K;P)} = |\alpha_1-\alpha_2|\sqrt{\frac{2n}{\pi e}} \tag{18}$$

Of course, the range of A may consist of several separated intervals.
Then (18) remains valid, as long as n is sufficiently large, so that
each interval has many distinguishable states; also,
$|\alpha_1-\alpha_2|$ should be replaced by the total length of the
intervals.

For given n, (18) achieves its maximum if $|\alpha_1-\alpha_2| = \pi/2$.
Hence,

$$W_{\max}(n) = \sqrt{\frac{\pi n}{2e}}. \tag{19}$$
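
Expressions (18) and (19) are simple enough to evaluate directly. The
helper names below are illustrative, and the formulas are asymptotic, so
the numbers are meaningful only for large n.

```python
import math

def distinguishable_states(n, alpha1, alpha2):
    """Eq. (18): |alpha1 - alpha2| * sqrt(2n / (pi e))."""
    return abs(alpha1 - alpha2) * math.sqrt(2.0 * n / (math.pi * math.e))

def max_distinguishable_states(n):
    """Eq. (19): the full quadrant, |alpha1 - alpha2| = pi/2."""
    return math.sqrt(math.pi * n / (2.0 * math.e))

# Example: the whole quadrant measured with n = 100 copies per state.
w_full = distinguishable_states(100, math.pi / 2, 0.0)
```

With n = 100 the full quadrant yields roughly 7.6 distinguishable
states, and quadrupling n doubles this number.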

3 The N-Dimensional Case

Consider now a quantum system whose states are unit vectors in an
N-dimensional complex Hilbert space $\mathbb{C}^N$. Choose an orthogonal
basis in $\mathbb{C}^N$ corresponding to a direct (von Neumann)
measurement. Since all quantum states having the same projections on the
basis vectors are indistinguishable by this measurement, the state space
$S^{2N-1}$ is reduced to the non-negative orthant of the unit sphere
$S^{N-1}$ in the real N-dimensional Euclidean space $\mathbb{R}^N$. Each
state vector is described now by N Cartesian coordinates
$x = (x_1, x_2, \ldots, x_N)$, $\sum_{i=1}^{N} x_i^2 = 1$, and
$p_i = x_i^2$ is the probability of the i-th outcome of the measurement.
Suppose we want to distinguish between states chosen from a domain D of
the non-negative orthant of $S^{N-1}$, and assume we are allowed to
perform the same measurement over n identical copies of each quantum
state, where $n \gg 1$. Let the quantum states be chosen with
probability density function (p.d.f.)
$P_P(p) = P_P(p_1, \ldots, p_N)$, where $\sum_{i=1}^{N} p_i = 1$. The
outcome of such a measurement performed over n identical states is an
N-dimensional random variable K which takes on values
$k = (k_1, k_2, \ldots, k_N)$, where $k_i$ ($i = 1, 2, \ldots, N$) is
the number of cases when the i-th result has been obtained. The
conditional probability distribution of K given P is multinomial:

$$P_{K/P}(k_1, \ldots, k_N/p_1, \ldots, p_N) = \frac{n!}{\prod_{i=1}^{N} k_i!}\,\prod_{i=1}^{N} p_i^{k_i}, \tag{20}$$

where $\sum_{i=1}^{N} k_i = n$.
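
A minimal sketch of the multinomial law (20), with illustrative function
names, may help fix the notation: the outcome probabilities are the
squared Cartesian coordinates of the state, and the counts must add up
to n.

```python
import math

def probabilities_from_state(x):
    """p_i = x_i^2 for a real unit vector x in the non-negative orthant."""
    return [xi * xi for xi in x]

def multinomial_pmf(ks, ps):
    """Eq. (20): n! / (k_1! ... k_N!) * p_1^k_1 ... p_N^k_N with n = sum(ks)."""
    coeff = math.factorial(sum(ks))
    for k in ks:
        coeff //= math.factorial(k)  # exact integer division at every step
    prob = float(coeff)
    for k, p in zip(ks, ps):
        prob *= p ** k
    return prob

# Example with N = 3: the state x = (0.6, 0.0, 0.8) gives p = (0.36, 0, 0.64).
ps = probabilities_from_state([0.6, 0.0, 0.8])
prob = multinomial_pmf([2, 0, 3], ps)
```

For N = 2 the function reduces to the binomial distribution (4).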

Denote by $P_K(k)$ the marginal probability distribution of K. Then
the information $I(K;P)$ in K about P is given by an expression similar
to (5):

$$I(K;P) = \int_{p\in D} \sum_{k} P_P(p)\,P_{K/P}(k/p)\,\ln\frac{P_{K/P}(k/p)}{P_K(k)}\,dp, \tag{21}$$

where summation is taken over all k such that $\sum_{i=1}^{N} k_i = n$.

It follows from Shannon's 12th theorem that for any $\varepsilon > 0$
the maximum number of distinct states $W(N,n,\varepsilon)$ chosen from D
in such a way that the probability of incorrect identification of the
state based on the results K of the measurement does not exceed
$\varepsilon$ satisfies the limit

$$\lim_{N\to\infty}\,\lim_{n\to\infty} \frac{\ln W(N,n,\varepsilon)}{I_{\sup}(K;P)} = 1. \tag{22}$$

Here $I_{\sup}(K;P)$ is the least upper bound of $I(K;P)$ over all
possible $P_P(p)$. Note that, in contrast with the 2-dimensional case,
there is no need to consider sequences of distinct states provided n and
N are sufficiently large.

Thus the number of distinct states (different values of P) that can be
distinguished with probability arbitrarily close to 1 is given by
$e^{I_{\sup}(K;P)}$. The computation of $I_{\sup}(K;P)$ can be performed
along the same lines as in the 2-dimensional case. For large n
($n/N \gg 1$), the multinomial distribution (20) can be approximated by
the N-dimensional Gaussian distribution [3]

$$P_{K/P}(k/p) \approx \frac{\exp\left[-\frac{1}{2}\sum_{i=1}^{N} \frac{(k_i - p_i n)^2}{p_i n}\right]}{(2\pi n)^{\frac{N-1}{2}}\,\prod_{i=1}^{N} p_i^{1/2}}. \tag{23}$$
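
Approximation (23) can be compared with the exact multinomial
probability on the hyperplane $\sum_i k_i = n$. The sketch below does so
for a small example; the function names are illustrative assumptions.

```python
import math

def log_multinomial_pmf(ks, ps):
    """Logarithm of Eq. (20), computed through log-gamma."""
    n = sum(ks)
    return (math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in ks)
            + sum(k * math.log(p) for k, p in zip(ks, ps)))

def gaussian_approx(ks, ps):
    """Eq. (23), evaluated at counts ks that sum to n."""
    n, dim = sum(ks), len(ks)
    exponent = -0.5 * sum((k - p * n) ** 2 / (p * n) for k, p in zip(ks, ps))
    norm = (2.0 * math.pi * n) ** ((dim - 1) / 2.0) * math.sqrt(math.prod(ps))
    return math.exp(exponent) / norm

# N = 3, n = 300, p = (0.2, 0.3, 0.5): at the maximum and slightly off it.
ps = [0.2, 0.3, 0.5]
ratio_peak = math.exp(log_multinomial_pmf([60, 90, 150], ps)) / gaussian_approx([60, 90, 150], ps)
ratio_off = math.exp(log_multinomial_pmf([62, 88, 150], ps)) / gaussian_approx([62, 88, 150], ps)
```

Both ratios are close to 1 for these parameters, consistent with the
requirement $n/N \gg 1$.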



Consider the distribution $P_X(x)$ of the states which is uniform over
the domain D. Denote the area of D by $|D| = \Omega$. Then

$$P_X(x) = \frac{1}{\Omega} \tag{24}$$

for $x \in D$, and $P_X(x) = 0$ otherwise. We will show that for large n
this distribution yields the maximum of $I(K;P)$. Distribution (24)
corresponds to the following distribution of the random variable P over
the domain D:

$$P_P(p_1, \ldots, p_N) = \frac{1}{\Omega}\left|J\!\left(\frac{x_1, \ldots, x_N}{p_1, \ldots, p_N}\right)\right| \tag{25}$$

$$= \frac{1}{2^{N-1}\,\Omega\,\prod_{i=1}^{N} p_i^{1/2}}, \tag{26}$$

where $J\!\left(\frac{x_1, \ldots, x_N}{p_1, \ldots, p_N}\right)$ is the
Jacobian of the coordinate transformation from x to p. The marginal
probability distribution of K is given by

$$P_K(k_1, \ldots, k_N) = \int_D P_P(p)\,P_{K/P}(k/p)\,dp_1 \ldots dp_N. \tag{27}$$

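For large n, the integrand in (27) has a sharp maximum at $p = k/n$.
Applying again the Laplace method we obtain:

$$P_K(k_1, \ldots, k_N) \approx \frac{1}{2^{N-1}\,\Omega\,n^{N/2-1}\,\prod_{i=1}^{N} k_i^{1/2}} \tag{28}$$

when $k/n$ corresponds to a point in the domain D; otherwise
$P_K(k) = 0$. The individual information $I(K;p)$ can be conveniently
evaluated by use of "reduced" distributions $P_{K'/P}(k'/p)$ and
$P_{K'}(k')$, with $k' = (k_1, \ldots, k_{N-1})$, where we take into
account explicitly the dependence between the components of the vector
k implied by the $\delta$-function (the constraint
$\sum_{i=1}^{N} k_i = n$):

$$P_{K'/P}(k'/p) = \frac{\exp\left[-\frac{1}{2}\sum_{i=1}^{N-1} \frac{(k_i - p_i n)^2}{p_i n} - \frac{\left((1-p_N)n - \sum_{i=1}^{N-1} k_i\right)^2}{2 p_N n}\right]}{(2\pi n)^{\frac{N-1}{2}}\,\prod_{i=1}^{N} p_i^{1/2}}, \tag{29}$$

$$P_{K'}(k') = \frac{1}{2^{N-1}\,\Omega\,n^{N/2-1}\left(n - \sum_{i=1}^{N-1} k_i\right)^{1/2}\prod_{i=1}^{N-1} k_i^{1/2}}. \tag{30}$$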

Then

$$I(K;p) = I(K';p) = \int P_{K'/P}(k'/p)\,\bigl[\ln P_{K'/P}(k'/p) - \ln P_{K'}(k')\bigr]\,dk_1 \ldots dk_{N-1} = I_1 + I_2. \tag{31}$$

The first term in (31) is simply the differential entropy (with the
opposite sign) of a multivariate (N-1)-dimensional Gaussian distribution
(29) with the determinant of covariance matrix
$d = n^{N-1}\prod_{i=1}^{N} p_i$. Hence

$$I_1 = -\tfrac{1}{2}\ln\bigl[(2\pi e)^{N-1} d\bigr] = -\tfrac{1}{2}\ln\Bigl[(2\pi e n)^{N-1}\prod_{i=1}^{N} p_i\Bigr]. \tag{32}$$

The second term in (31) can be evaluated by the Laplace method, since
the integrand has a sharp maximum at $k_i = p_i n$
($i = 1, \ldots, N-1$). Hence

$$I_2 = \ln\Omega + \ln(2n)^{N-1} + \tfrac{1}{2}\ln\prod_{i=1}^{N} p_i. \tag{33}$$



Thus, $I(K;p) = \ln\Omega + \frac{N-1}{2}\ln\frac{2n}{\pi e}$ does not
depend on p. This proves that the distribution (24) is the optimal one
and the maximum information in K about P is asymptotically equal to

$$I_{\sup}(K;P) = \ln\Omega + \frac{N-1}{2}\ln\frac{2n}{\pi e}. \tag{34}$$
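
The cancellation of the p-dependence between (32) and (33) can be
verified numerically. In the sketch below, with illustrative function
names, the two terms are computed separately and their sum is compared
with (34).

```python
import math

def entropy_term(ps, n):
    """I_1 of Eq. (32): -(1/2) ln[(2 pi e n)^(N-1) * prod p_i]."""
    dim = len(ps)
    return -0.5 * ((dim - 1) * math.log(2.0 * math.pi * math.e * n)
                   + sum(math.log(p) for p in ps))

def laplace_term(ps, n, omega):
    """I_2 of Eq. (33): ln(Omega) + (N-1) ln(2n) + (1/2) ln prod p_i."""
    dim = len(ps)
    return (math.log(omega) + (dim - 1) * math.log(2.0 * n)
            + 0.5 * sum(math.log(p) for p in ps))

def information_sup(dim, n, omega):
    """Eq. (34): ln(Omega) + ((N-1)/2) ln(2n / (pi e))."""
    return math.log(omega) + 0.5 * (dim - 1) * math.log(2.0 * n / (math.pi * math.e))
```

Whatever point p of the simplex is supplied, `entropy_term` plus
`laplace_term` returns the same value as `information_sup`, since the
two $\ln\prod_i p_i$ contributions cancel exactly.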

The number of distinguishable states is given by the following
expression:

$$W(N,n,\Omega) = \Omega\left(\frac{2n}{\pi e}\right)^{\frac{N-1}{2}}. \tag{35}$$



Expression (35) turns into (18) for N = 2. Indeed, it is easy to see
that a uniform 2-dimensional distribution in Cartesian coordinates
restricted to the non-negative quadrant of a unit circumference results
in a uniform distribution over the polar angle $\alpha$. Similarly, in
the N-dimensional case we obtain a uniform distribution over the area of
the domain D, i.e. over the solid angle.
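
One way to draw states from such a uniform distribution when D is the
whole non-negative orthant is sketched below. It relies on the standard
fact that a normalised vector of independent Gaussian variables is
uniformly distributed on the sphere. This sampling method and the
function names are illustrative assumptions, not part of the derivation.

```python
import math
import random

def sample_orthant(dim, rng):
    """Uniform point on the non-negative orthant of the unit sphere S^(dim-1)."""
    g = [abs(rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def sample_angle(rng):
    """Polar angle of a uniform point on the quarter circle (dim = 2)."""
    x, y = sample_orthant(2, rng)
    return math.atan2(y, x)

rng = random.Random(1)
angles = [sample_angle(rng) for _ in range(20000)]
mean_angle = sum(angles) / len(angles)
mean_p1 = sum(sample_orthant(3, rng)[0] ** 2 for _ in range(20000)) / 20000
```

For N = 2 the sampled angles fill $[0, \pi/2]$ evenly, so their mean is
close to $\pi/4$. For N = 3 the mean of $p_1 = x_1^2$ is close to 1/3,
as symmetry between the outcomes requires.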

The number of distinguishable states reaches a maximum (for given N
and n) if D is the entire non-negative orthant of the N-dimensional unit
sphere. Since the area of the surface of the N-dimensional unit sphere
is $2\pi^{N/2}\{\Gamma(N/2)\}^{-1}$, the area of the non-negative
orthant (the solid angle) is

$$\Omega_{\max} = \frac{\pi^{N/2}}{2^{N-1}\,\Gamma(N/2)}, \tag{36}$$

where $\Gamma$ is Euler's gamma-function. Thus the maximum number of
distinguishable states in N-dimensional space is

$$W_{\max}(N,n) = \frac{\sqrt{\pi}}{\Gamma(N/2)}\left(\frac{n}{2e}\right)^{\frac{N-1}{2}}. \tag{37}$$

Remember that (35) and (37) are valid only when approximation (23) is
valid, i.e. when $n \gg N$.
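
Expressions (35) to (37) can be evaluated with the standard gamma
function. The sketch below uses illustrative function names and is
meaningful only in the regime $n \gg N$.

```python
import math

def orthant_area(dim):
    """Eq. (36): area of the non-negative orthant of the unit sphere S^(dim-1)."""
    return math.pi ** (dim / 2.0) / (2 ** (dim - 1) * math.gamma(dim / 2.0))

def distinguishable_states(dim, n, omega):
    """Eq. (35): Omega * (2n / (pi e))^((N-1)/2)."""
    return omega * (2.0 * n / (math.pi * math.e)) ** ((dim - 1) / 2.0)

def max_distinguishable_states(dim, n):
    """Eq. (37): sqrt(pi) / Gamma(N/2) * (n / (2e))^((N-1)/2)."""
    return (math.sqrt(math.pi) / math.gamma(dim / 2.0)
            * (n / (2.0 * math.e)) ** ((dim - 1) / 2.0))
```

Substituting `orthant_area(dim)` for `omega` in `distinguishable_states`
reproduces `max_distinguishable_states`, and for `dim = 2` the result
coincides with (19).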



4 Conclusion 

The main result of the paper can be summarized as follows. The number
of distinguishable quantum states in an N-dimensional Hilbert space is
proportional to the number of identical copies of each state to the
power $(N-1)/2$ and to the area $\Omega$ of the domain of the unit
sphere occupied by the state vectors. Surprisingly, it does not depend
on the shape and the position of this domain, provided that the main
assumption $n/N \gg 1$ is satisfied. The domain does not have to be
connected: the results hold for a set of separate domains with the same
total area $\Omega$. The optimal distribution is uniform over the
domain, which suggests that the states should be chosen at equal angular
distances from each other. For the 2-dimensional case, the number of
distinguishable states is proportional to the angular interval and to
the square root of the number of identical copies of each state measured
(cf. [1]).

The result that the number of distinguishable states is proportional
to the geometric distance as measured by angle in Hilbert space is quite
nontrivial and noteworthy. Indeed, it suggests that the metric of
Hilbert space may arise not from a physical principle, but rather as a
consequence of an optimal statistical inference procedure.